CNTRLPLANE-2589: Migrate to OTE support,fix pod security,Pathological Event monitor failures#859
Conversation
|
@ropatil010: This pull request references CNTRLPLANE-2589 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis PR refactors and significantly expands encryption and OIDC e2e test suites, extracts shared test implementations into a new helper layer, makes the Keycloak client context-aware across all operations, updates test library signatures for consistent context-first parameter ordering, and hardens pod security for identity provider test deployments. ChangesEncryption E2E Test Suite
OIDC E2E Test Suite
Test Library Refactoring and Enhancements
IDP Deployment Security and Cleanup
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 3 warnings, 1 inconclusive)
✅ Passed checks (10 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
debcb9a to
9541074
Compare
|
Before fix in CI logs we were hitting the same error which was mentioned in the Reverting PR: #857 After fix the CI logs got pass /assign @liouk @stbenjam |
|
@ropatil010 please correct me if I'm wrong, but this issue seems that it's caused by OTE itself, is there a plan on fixing that on the OTE side? |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (4)
test/e2e-oidc/external_oidc.go (1)
441-470: Consider extending algorithm support for future-proofing.The
extractRSAPubKeyFunconly handles RS256. While this matches current Keycloak defaults, if the IDP configuration changes to use RS384/RS512, tests would fail with an unhelpful error. This is acceptable for now but worth noting.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-oidc/external_oidc.go` around lines 441 - 470, extractRSAPubKeyFunc currently only accepts "RS256" and returns an error for other RSA algorithms; update it to accept other RSA signing algorithms (e.g., "RS384", "RS512") or any algorithm that starts with "RS" so the JWKS parsing is future-proof: modify the switch on key.Alg in extractRSAPubKeyFunc (and the related issuerJWKS/Key usage) to handle "RS384" and "RS512" (or use a strings.HasPrefix(key.Alg, "RS") check) and keep constructing the rsa.PublicKey the same way, while keeping the unexpected-algorithm error for non-RS algorithms. Ensure the error messages still include key.KID and key.Alg for clarity.test/e2e-encryption-perf/encryption_perf.go (1)
55-66: Simplify error handling with directt.Fatalcalls.The pattern of creating an error and immediately passing it to
require.NoErroris confusing. Since the error is always non-nil when created,require.NoErrorwill always fail, making this equivalent tot.Fatalbut harder to read.♻️ Proposed simplification
AssertDBPopulatedFunc: func(t testing.TB, errorStore map[string]int, statStore map[string]int) { tokenCount, ok := statStore[tokenStatsKey] if !ok { - err := errors.New("missing oauth access tokens count stats, can't continue the test") - require.NoError(t, err) + t.Fatal("missing oauth access tokens count stats, can't continue the test") } if tokenCount < 14000 { - err := fmt.Errorf("expected to create at least 14000 tokens but %d were created", tokenCount) - require.NoError(t, err) + t.Fatalf("expected to create at least 14000 tokens but %d were created", tokenCount) } t.Logf("Created %d access tokens", tokenCount) },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-encryption-perf/encryption_perf.go` around lines 55 - 66, In AssertDBPopulatedFunc simplify error handling by replacing the pattern that constructs an error and calls require.NoError with direct test failures: when statStore[tokenStatsKey] is missing call t.Fatalf (or require.FailNow) with a clear message about the missing oauth access tokens count, and when tokenCount < 14000 call t.Fatalf (or require.FailNow) with a message showing the expected vs actual tokenCount; update references around AssertDBPopulatedFunc and tokenStatsKey accordingly.test/e2e-encryption-rotation/encryption_rotation.go (1)
47-75: Consider adding retry logic for the operator update.The
UnsupportedConfigFuncperforms a read-modify-write operation without handling potential conflict errors. If the operator object is modified between theGet(line 49) andUpdate(line 73), the update will fail with a conflict. For a 3-hour test, this could cause intermittent failures.If conflicts are expected to be rare in this test context, the current implementation is acceptable, but retry-on-conflict would improve reliability.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-encryption-rotation/encryption_rotation.go` around lines 47 - 75, Wrap the read-modify-write in UnsupportedConfigFunc with retry-on-conflict logic (e.g., use k8s.io/client-go/util/retry.RetryOnConflict) so that if cs.OperatorClient.Update returns a conflict you re-get the Operator (cs.OperatorClient.Get), reapply the unsupportedConfig merge steps and retry the Update; ensure the loop returns the final error if retries are exhausted and keep references to UnsupportedConfigFunc, cs.OperatorClient.Get, and cs.OperatorClient.Update to locate the change.test/e2e-encryption/encryption.go (1)
54-74: Consider reusing the providedClientSetparameter.The
CreateResourceFunccallback receives alibrary.ClientSetparameter but ignores it, creating a new client viaoperatorencryption.GetClients(t). If this is intentional (e.g., needing operator-specific client configuration), adding a brief comment would clarify the design choice.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-encryption/encryption.go` around lines 54 - 74, The CreateResourceFunc inside testEncryptionTurnOnAndOff ignores the provided library.ClientSet parameter and calls operatorencryption.GetClients(t); either update CreateResourceFunc to use the passed-in client (the second parameter) when calling operatorencryption.CreateAndStoreTokenOfLife (or adapt/convert the library.ClientSet to the expected client type), or if using GetClients(t) is intentional, add a brief comment above CreateResourceFunc explaining why operatorencryption.GetClients(t) is required instead of the supplied client; reference CreateResourceFunc, operatorencryption.CreateAndStoreTokenOfLife, and operatorencryption.GetClients for locating the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@go.mod`:
- Around line 136-137: The go.mod replace pointing to the forked module
(github.com/ropatil010/openshift-tests-extension
v0.0.0-20260406095036-3a490698dab7) introduces supply-chain and reproducibility
risk; update the repository by adding a clear TODO comment in go.mod and create
a tracking issue linking upstream PR `#60` ("feat: configure klog to write to
stderr") so we revert this replace as soon as that PR is merged, then remove the
replace and restore github.com/openshift-eng/openshift-tests-extension; also add
a CI/release gate (or note in the release checklist) to block merging release
candidates until the follow-up removal is completed.
In `@test/e2e-encryption-rotation/encryption_rotation.go`:
- Around line 27-28: Fix the typo in the comment above testEncryptionRotation:
change "encyrption.Reason" to "encryption.Reason" in the comment that describes
forcing a key rotation so the term is spelled correctly and matches the
configuration field name.
In `@test/library/encryption/perf_helpers.go`:
- Around line 35-46: The predicate passed to wait.PollUntilContextTimeout
currently sends any EncryptionMigrationControllerProgressing=True timestamp even
if it predates the test and it never checks the cond.Reason; update the
predicate used by wait.PollUntilContextTimeout in perf_helpers.go (the anonymous
func that calls getOperatorConditionsFn and writes to migrationStartedCh) to
also require cond.Reason == "Migrating" and that cond.LastTransitionTime is
strictly after a test start time captured before the poll (e.g., record start :=
time.Now() or start := time.Now().UTC() before calling
wait.PollUntilContextTimeout, then only send cond.LastTransitionTime.Time if
cond.Type == "EncryptionMigrationControllerProgressing" && cond.Status ==
operatorv1.ConditionTrue && cond.Reason == "Migrating" &&
cond.LastTransitionTime.Time.After(start)). Ensure the log message remains
meaningful when the newer transition is observed.
In `@test/library/encryption/scenarios.go`:
- Around line 189-210: The test uses a non-blocking select to read
migrationStartedCh which can fall through before the watcher goroutine sends;
change this to wait for the signal with a bounded timeout instead of using the
default case: replace the select block reading migrationStartedCh with a
blocking receive combined with a timeout (e.g., select { case migrationStarted
:= <-migrationStartedCh: ... case <-time.After(timeout): tb.Error(...) }) or use
a context with deadline that you cancel when runTestEncryptionPerf returns;
ensure you reference the watcher by keeping migrationStartedCh passed to
watchForMigrationControllerProgressingConditionAsync and use that same channel
for the bounded wait so the watcher is not left running indefinitely.
- Around line 72-83: The switch in TestEncryptionType currently routes the
empty-string provider ("") to TestEncryptionTypeIdentity which causes
APIServer.Spec.Encryption.Type to be set to "identity"; change the dispatch so
that case configv1.EncryptionTypeIdentity handles the explicit identity value
and the empty-string case ("") dispatches to TestEncryptionTypeUnset instead.
Update the switch in the TestEncryptionType function (the cases for
configv1.EncryptionTypeIdentity and "") so that "" calls
TestEncryptionTypeUnset(tb, scenario) while leaving the explicit Identity case
calling TestEncryptionTypeIdentity(tb, scenario).
- Around line 340-349: The APIServer Get+Update can fail with resourceVersion
conflicts; wrap the read-modify-update of apiServer (performed via
libClientSet.ApiServerConfig.Get and Update) inside a retry.RetryOnConflict
loop: import "k8s.io/client-go/util/retry", use retry.RetryOnConflict to
repeatedly Get the APIServer, set apiServer.Spec.Encryption.Type =
encryptionType when changed, and call Update until it succeeds or the retry
gives up; ensure errors from Update are returned/checked with
require.NoError(tb, err) as before.
---
Nitpick comments:
In `@test/e2e-encryption-perf/encryption_perf.go`:
- Around line 55-66: In AssertDBPopulatedFunc simplify error handling by
replacing the pattern that constructs an error and calls require.NoError with
direct test failures: when statStore[tokenStatsKey] is missing call t.Fatalf (or
require.FailNow) with a clear message about the missing oauth access tokens
count, and when tokenCount < 14000 call t.Fatalf (or require.FailNow) with a
message showing the expected vs actual tokenCount; update references around
AssertDBPopulatedFunc and tokenStatsKey accordingly.
In `@test/e2e-encryption-rotation/encryption_rotation.go`:
- Around line 47-75: Wrap the read-modify-write in UnsupportedConfigFunc with
retry-on-conflict logic (e.g., use k8s.io/client-go/util/retry.RetryOnConflict)
so that if cs.OperatorClient.Update returns a conflict you re-get the Operator
(cs.OperatorClient.Get), reapply the unsupportedConfig merge steps and retry the
Update; ensure the loop returns the final error if retries are exhausted and
keep references to UnsupportedConfigFunc, cs.OperatorClient.Get, and
cs.OperatorClient.Update to locate the change.
In `@test/e2e-encryption/encryption.go`:
- Around line 54-74: The CreateResourceFunc inside testEncryptionTurnOnAndOff
ignores the provided library.ClientSet parameter and calls
operatorencryption.GetClients(t); either update CreateResourceFunc to use the
passed-in client (the second parameter) when calling
operatorencryption.CreateAndStoreTokenOfLife (or adapt/convert the
library.ClientSet to the expected client type), or if using GetClients(t) is
intentional, add a brief comment above CreateResourceFunc explaining why
operatorencryption.GetClients(t) is required instead of the supplied client;
reference CreateResourceFunc, operatorencryption.CreateAndStoreTokenOfLife, and
operatorencryption.GetClients for locating the change.
In `@test/e2e-oidc/external_oidc.go`:
- Around line 441-470: extractRSAPubKeyFunc currently only accepts "RS256" and
returns an error for other RSA algorithms; update it to accept other RSA signing
algorithms (e.g., "RS384", "RS512") or any algorithm that starts with "RS" so
the JWKS parsing is future-proof: modify the switch on key.Alg in
extractRSAPubKeyFunc (and the related issuerJWKS/Key usage) to handle "RS384"
and "RS512" (or use a strings.HasPrefix(key.Alg, "RS") check) and keep
constructing the rsa.PublicKey the same way, while keeping the
unexpected-algorithm error for non-RS algorithms. Ensure the error messages
still include key.KID and key.Alg for clarity.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0a512169-a204-4df2-92e6-760ec43345c5
⛔ Files ignored due to path filters (17)
go.sumis excluded by!**/*.sumvendor/github.com/openshift-eng/openshift-tests-extension/pkg/cmd/cmd.gois excluded by!vendor/**,!**/vendor/**vendor/github.com/openshift-eng/openshift-tests-extension/pkg/cmd/cmdrun/runsuite.gois excluded by!vendor/**,!**/vendor/**vendor/github.com/openshift-eng/openshift-tests-extension/pkg/extension/extensiontests/result.gois excluded by!vendor/**,!**/vendor/**vendor/github.com/openshift-eng/openshift-tests-extension/pkg/extension/extensiontests/result_writer.gois excluded by!vendor/**,!**/vendor/**vendor/github.com/openshift-eng/openshift-tests-extension/pkg/extension/extensiontests/viewer.htmlis excluded by!vendor/**,!**/vendor/**vendor/github.com/openshift-eng/openshift-tests-extension/pkg/ginkgo/logging.gois excluded by!vendor/**,!**/vendor/**vendor/k8s.io/klog/v2/README.mdis excluded by!vendor/**,!**/vendor/**vendor/k8s.io/klog/v2/internal/serialize/keyvalues.gois excluded by!vendor/**,!**/vendor/**vendor/k8s.io/klog/v2/internal/serialize/keyvalues_no_slog.gois excluded by!vendor/**,!**/vendor/**vendor/k8s.io/klog/v2/internal/serialize/keyvalues_slog.gois excluded by!vendor/**,!**/vendor/**vendor/k8s.io/klog/v2/klog.gois excluded by!vendor/**,!**/vendor/**vendor/k8s.io/klog/v2/klogr.gois excluded by!vendor/**,!**/vendor/**vendor/k8s.io/klog/v2/klogr_slog.gois excluded by!vendor/**,!**/vendor/**vendor/k8s.io/klog/v2/textlogger/options.gois excluded by!vendor/**,!**/vendor/**vendor/k8s.io/klog/v2/textlogger/textlogger.gois excluded by!vendor/**,!**/vendor/**vendor/modules.txtis excluded by!vendor/**,!**/vendor/**
📒 Files selected for processing (22)
cmd/cluster-authentication-operator-tests-ext/main.gogo.modpkg/controllers/readiness/wellknown_ready_controller.gotest/e2e-encryption-kms/encryption_kms.gotest/e2e-encryption-kms/encryption_kms_test.gotest/e2e-encryption-kms/main_test.gotest/e2e-encryption-perf/encryption_perf.gotest/e2e-encryption-perf/encryption_perf_test.gotest/e2e-encryption-rotation/e2e-encryption-rotation_test.gotest/e2e-encryption-rotation/encryption_rotation.gotest/e2e-encryption/encryption.gotest/e2e-encryption/encryption_test.gotest/e2e-encryption/main_test.gotest/e2e-oidc/external_oidc.gotest/e2e-oidc/external_oidc_test.gotest/library/client.gotest/library/encryption/helpers.gotest/library/encryption/perf_helpers.gotest/library/encryption/scenarios.gotest/library/encryption_wrappers.gotest/library/featuregates.gotest/library/waits.go
💤 Files with no reviewable changes (2)
- test/e2e-encryption/main_test.go
- test/e2e-encryption-kms/main_test.go
4342519 to
8f115d7
Compare
This commit migrates five test suites (e2e-encryption, e2e-encryption-kms, e2e-encryption-perf, e2e-encryption-rotation, e2e-oidc) from standalone test binaries to the OpenShift Tests Extension (OTE) framework using Ginkgo v2. Key changes: - Migrate tests to Ginkgo v2 with OTE integration - Extract test logic into dedicated helper files for better organization - Remove obsolete main_test.go files in favor of Ginkgo suite setup - Configure appropriate test suite parallelism and qualifiers - Add klog stderr redirection to fix JSON output corruption Critical fix for klog output corruption: - Add klog.SetOutput(os.Stderr) in test binary's main() to prevent klog warnings from corrupting JSON output during 'openshift-tests list' commands - This fixes "invalid character 'W'" parsing errors in CI that caused PR openshift#839 to be reverted in PR openshift#857 - Ensures stdout contains only clean JSON for CI consumption while klog warnings go to stderr Test organization: - operator/parallel: Non-serial operator, templates, and tokens tests (parallelism: 4) - operator/serial: Serial operator, OIDC, templates, and tokens tests - operator/disruptive: Tests marked as [Disruptive] - operator-encryption/serial: Basic encryption tests - operator-encryption-rotation/serial: Encryption key rotation tests - operator-encryption-perf/serial: Encryption performance tests - operator-encryption-kms/serial: KMS encryption tests Dependencies: - Use standard OTE version v0.0.0-20251205182537-ff5553e56f33 - Updated klog dependency to v2.140.0 - No fork dependencies See: openshift#857 (revert) openshift#859 (this PR) Co-Authored-By: Rohit Patil <[email protected]>
8f115d7 to
6c90e52
Compare
Restore the init package approach to fix 'invalid character W looking for beginning of value' errors in CI when listing tests. The previous fix using klog.SetOutput(os.Stderr) in main() was too late - klog warnings were already generated during Go's init phase, corrupting the JSON output before main() could run. This commit implements a two-phase approach: 1. Init phase (pkg/test/init): Suppress all klog output using io.Discard - Runs during Go initialization, before any dependency can trigger warnings - Prevents warnings from appearing in stdout during 'openshift-tests list' 2. Main phase: Reconfigure klog to stderr for normal test execution - Enables proper logging after initialization completes - Keeps stdout clean for JSON output This is the same approach that made commit 9541074 pass in CI, but was inadvertently removed in commit 6c90e52, causing test failures. Fixes the root cause of CI failures in PR openshift#859 See: openshift#857 Co-Authored-By: Rohit Patil <[email protected]>
|
Waiting for PR: openshift/origin#31009 to gets merged which fixes the issue "error: encountered errors while listing tests: invalid character 'W' looking for beginning of value". The above mentioned issue has been fixed in this PR using init things but looks like it is not acceptable. |
753ca4a to
aa529ed
Compare
|
@ropatil010: This pull request references CNTRLPLANE-2589 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (4)
test/library/encryption/scenarios.go (3)
340-349:⚠️ Potential issue | 🟠 MajorRetry the APIServer update on conflict.
This is still a single read-modify-write against
apiserver/cluster. If anything updates the object betweenGetandUpdate, the test fails on a transientresourceVersionconflict even though retrying would succeed. Wrap the mutation inretry.RetryOnConflicthere.Suggested fix
+ "k8s.io/client-go/util/retry"- apiServer, err := libClientSet.ApiServerConfig.Get(context.TODO(), "cluster", metav1.GetOptions{}) - require.NoError(tb, err) - - // Update encryption type if needed - needsUpdate := apiServer.Spec.Encryption.Type != encryptionType - if needsUpdate { - tb.Logf("Updating encryption type in the config file for APIServer to %q", encryptionType) - apiServer.Spec.Encryption.Type = encryptionType - _, err = libClientSet.ApiServerConfig.Update(context.TODO(), apiServer, metav1.UpdateOptions{}) - require.NoError(tb, err) + apiServer, err := libClientSet.ApiServerConfig.Get(context.TODO(), "cluster", metav1.GetOptions{}) + require.NoError(tb, err) + + needsUpdate := apiServer.Spec.Encryption.Type != encryptionType + if needsUpdate { + tb.Logf("Updating encryption type in the config file for APIServer to %q", encryptionType) + err = retry.RetryOnConflict(retry.DefaultRetry, func() error { + apiServer, err = libClientSet.ApiServerConfig.Get(context.TODO(), "cluster", metav1.GetOptions{}) + if err != nil { + return err + } + apiServer.Spec.Encryption.Type = encryptionType + _, err = libClientSet.ApiServerConfig.Update(context.TODO(), apiServer, metav1.UpdateOptions{}) + return err + }) + require.NoError(tb, err) } else { tb.Logf("APIServer is already configured to use %q mode", encryptionType) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/library/encryption/scenarios.go` around lines 340 - 349, The update to the APIServer object does a single read-modify-write and can fail on resourceVersion conflicts; wrap the mutation+Update call in a retry.RetryOnConflict loop so it re-reads and retries on conflict. Locate the block that calls libClientSet.ApiServerConfig.Get and then sets apiServer.Spec.Encryption.Type and calling libClientSet.ApiServerConfig.Update, and replace that update path with a retry.RetryOnConflict that fetches the latest ApiServer, applies the same encryptionType mutation to apiServer.Spec.Encryption.Type, and retries the Update until it succeeds or a non-conflict error occurs.
204-210:⚠️ Potential issue | 🟠 MajorDon't use a non-blocking receive for
migrationStartedCh.If the watcher goroutine sends a moment later, this
defaultpath reports a false failure and never computes the migration duration. Wait on the same channel with a bounded timeout or cancelable context instead of falling through immediately.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/library/encryption/scenarios.go` around lines 204 - 210, The non-blocking select on migrationStartedCh can cause false failures; replace the select/default with a blocking receive that waits with a bounded timeout or a cancelable context so the test actually observes the watcher send. Specifically, wait for migrationStartedCh (instead of the default path) and compute migrationTime = endTimeStamp.Sub(migrationStarted) before calling tb.Logf and scenario.AssertMigrationTime; if the timeout/context expires, call tb.Error with a clear timeout message. Use the existing symbols migrationStartedCh, endTimeStamp, tb.Logf, and scenario.AssertMigrationTime to locate and update the code.
73-81:⚠️ Potential issue | 🟠 MajorDispatch
""toTestEncryptionTypeUnset.The empty-string provider still falls through to
TestEncryptionTypeIdentity, which writes"identity"intoAPIServer.Spec.Encryption.Typeinstead of exercising the unset/defaulting path that this helper already models separately.Suggested fix
switch provider { case configv1.EncryptionTypeAESCBC: TestEncryptionTypeAESCBC(tb, scenario) case configv1.EncryptionTypeAESGCM: TestEncryptionTypeAESGCM(tb, scenario) case configv1.EncryptionTypeKMS: TestEncryptionTypeKMS(tb, scenario) - case configv1.EncryptionTypeIdentity, "": + case configv1.EncryptionTypeIdentity: TestEncryptionTypeIdentity(tb, scenario) + case "": + TestEncryptionTypeUnset(tb, scenario) default: tb.Fatalf("Unknown encryption type: %s", provider) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/library/encryption/scenarios.go` around lines 73 - 81, The switch in the dispatcher incorrectly routes the empty-string provider to TestEncryptionTypeIdentity; update the case handling so that configv1.EncryptionTypeIdentity maps only to TestEncryptionTypeIdentity and the empty string ("") maps to TestEncryptionTypeUnset. Locate the switch in TestEncryption scenarios (the switch on provider in scenarios.go) and replace the combined case `case configv1.EncryptionTypeIdentity, "": TestEncryptionTypeIdentity(tb, scenario)` with two separate cases: one calling TestEncryptionTypeIdentity(tb, scenario) for configv1.EncryptionTypeIdentity and one calling TestEncryptionTypeUnset(tb, scenario) for the "" provider; leave the other cases (TestEncryptionTypeAESCBC, TestEncryptionTypeAESGCM, TestEncryptionTypeKMS) unchanged.test/library/encryption/perf_helpers.go (1)
35-46:⚠️ Potential issue | 🟠 MajorOnly capture a fresh
"Migrating"transition.This still records any
EncryptionMigrationControllerProgressing=Truecondition, even if it predates the current test or has a different reason. That can inflate the reported migration duration or use a stale start time. Capture astart := time.Now()before polling and only sendLastTransitionTimewhen the condition isTrue,Reason == "Migrating", and the transition is newer thanstart.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/library/encryption/perf_helpers.go` around lines 35 - 46, The poll currently sends any EncryptionMigrationControllerProgressing condition with Status True, possibly using a stale LastTransitionTime; fix by recording a start := time.Now() immediately before calling wait.PollUntilContextTimeout and inside the poll only send cond.LastTransitionTime to migrationStartedCh when cond.Type == "EncryptionMigrationControllerProgressing", cond.Status == operatorv1.ConditionTrue, cond.Reason == "Migrating" and cond.LastTransitionTime.Time.After(start); use getOperatorConditionsFn to fetch conditions and ensure you return true,nil only when the transition is newer than start.
🧹 Nitpick comments (3)
pkg/controllers/readiness/wellknown_ready_controller.go (1)
355-358: Optional: cache KAS service port once per function for readability.Line 357 is correct, but pulling
getKASServicePort()out of the loop makes intent clearer and avoids repeated calls in the condition.♻️ Proposed refactor
func getKASTargetPortFromService(service *corev1.Service) (int, bool) { + kasPort := getKASServicePort() for _, port := range service.Spec.Ports { - if targetPort := port.TargetPort.IntValue(); targetPort != 0 && port.Protocol == corev1.ProtocolTCP && int(port.Port) == getKASServicePort() { + if targetPort := port.TargetPort.IntValue(); targetPort != 0 && port.Protocol == corev1.ProtocolTCP && int(port.Port) == kasPort { return targetPort, true } } return 0, false }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controllers/readiness/wellknown_ready_controller.go` around lines 355 - 358, The loop in getKASTargetPortFromService repeatedly calls getKASServicePort() inside the condition which obscures intent; cache the result once before iterating by assigning kasPort := getKASServicePort(), then use kasPort in the loop condition (referencing getKASTargetPortFromService and getKASServicePort) so the code is clearer and avoids repeated calls.test/e2e-encryption-perf/encryption_perf.go (2)
55-65: Consider usingt.Fatalfdirectly instead ofrequire.NoErrorwith fabricated errors.The current pattern creates an error just to pass it to
require.NoError, which is indirect. Usingt.Fatalfwould be more idiomatic:♻️ Suggested simplification
AssertDBPopulatedFunc: func(t testing.TB, errorStore map[string]int, statStore map[string]int) { tokenCount, ok := statStore[tokenStatsKey] if !ok { - err := errors.New("missing oauth access tokens count stats, can't continue the test") - require.NoError(t, err) + t.Fatalf("missing oauth access tokens count stats, can't continue the test") } if tokenCount < 14000 { - err := fmt.Errorf("expected to create at least 14000 tokens but %d were created", tokenCount) - require.NoError(t, err) + t.Fatalf("expected to create at least 14000 tokens but %d were created", tokenCount) } t.Logf("Created %d access tokens", tokenCount) },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-encryption-perf/encryption_perf.go` around lines 55 - 65, The AssertDBPopulatedFunc currently fabricates errors and calls require.NoError; change those to direct test failures using t.Fatalf: when tokenStatsKey is missing, call t.Fatalf("missing oauth access tokens count stats, can't continue the test"); when tokenCount < 14000, call t.Fatalf("expected to create at least 14000 tokens but %d were created", tokenCount). Update the block in AssertDBPopulatedFunc (which reads tokenCount, ok := statStore[tokenStatsKey]) to remove the temporary err variables and use t.Fatalf with the appropriate messages.
81-97: Unusednamespaceparameter in the returned function.The
namespaceparameter on line 82 is unused. If this is intentional (OAuth access tokens are cluster-scoped), consider using_to make it explicit that it's intentionally ignored for clarity, consistent with the other unused parameters.♻️ Suggested clarification
func createAccessTokenWrapper(ctx context.Context, tokenClient oauthclient.OAuthAccessTokensGetter) library.DBLoaderFuncType { - return func(_ kubernetes.Interface, namespace string, errorCollector func(error), statsCollector func(string)) error { + return func(_ kubernetes.Interface, _ string, errorCollector func(error), statsCollector func(string)) error {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-encryption-perf/encryption_perf.go` around lines 81 - 97, The returned anonymous function in createAccessTokenWrapper currently declares a parameter named namespace that is never used; update the signature of the returned func (in createAccessTokenWrapper) to replace the unused namespace parameter with _ (e.g., func(_ kubernetes.Interface, _ string, errorCollector func(error), statsCollector func(string)) error) so it's explicit the namespace is intentionally ignored while keeping the rest of the behavior the same.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e-oidc/external_oidc.go`:
- Line 75: The suite is being gated incorrectly: change the
CheckFeatureGatesOrSkip call so it only requires FeatureGateExternalOIDC (i.e.,
call CheckFeatureGatesOrSkip(t, testCtx, testClient.configClient,
features.FeatureGateExternalOIDC) instead of passing both gates), and remove
FeatureGateExternalOIDCWithAdditionalClaimMappings from the top-level gate; for
tests that need the additional-claim behavior, add local calls to
CheckFeatureGatesOrSkip (or explicit checks) inside those specific subtests
before exercising code paths that depend on
FeatureGateExternalOIDCWithAdditionalClaimMappings; update references around
CheckFeatureGatesOrSkip and any subtest bodies that currently assume the extra
gate is globally enabled.
- Around line 111-118: The cleanup currently always calls
test.WaitForNewKASRollout after testClient.authResourceRollback even when
authResourceRollback was a no-op; change the cleanup closure (around
kasOriginalRevision, testClient.authResourceRollback, and
test.WaitForNewKASRollout) to detect whether the rollback actually persisted a
change and only call WaitForNewKASRollout when it did. Implement this by either:
(A) updating authResourceRollback to return an indicator (e.g., bool changed) in
addition to error and branch on that, or (B) after authResourceRollback, fetch
the current authentication spec (same shape as origAuthSpec) and compare it to
origAuthSpec and only invoke test.WaitForNewKASRollout if they differ; ensure
require.NoError checks still run for the rollback error but skip the rollout
wait when no spec change occurred.
- Around line 554-568: The informers are started with a non-cancelable context,
leaving goroutines running; change the Start/WaitForCacheSync usage to use a
cancelable context derived via context.WithCancel (derive cancelCtx, cancel :=
context.WithCancel(ctx) or similar) before calling
dynamicInformers.Start(cancelCtx.Done()) and
dynamicInformers.WaitForCacheSync(cancelCtx.Done()), and register cleanup to
stop them by either calling cancel() during test teardown or storing
dynamicInformers on the testClient and calling
dynamicInformers.Shutdown()/Shutdown in the teardown; update the code paths
around NewClusterScopedOperatorClient, dynamicInformers, Start,
WaitForCacheSync, and testClient to ensure the cancel function or Shutdown is
invoked in cleanup.
---
Duplicate comments:
In `@test/library/encryption/perf_helpers.go`:
- Around line 35-46: The poll currently sends any
EncryptionMigrationControllerProgressing condition with Status True, possibly
using a stale LastTransitionTime; fix by recording a start := time.Now()
immediately before calling wait.PollUntilContextTimeout and inside the poll only
send cond.LastTransitionTime to migrationStartedCh when cond.Type ==
"EncryptionMigrationControllerProgressing", cond.Status ==
operatorv1.ConditionTrue, cond.Reason == "Migrating" and
cond.LastTransitionTime.Time.After(start); use getOperatorConditionsFn to fetch
conditions and ensure you return true,nil only when the transition is newer than
start.
In `@test/library/encryption/scenarios.go`:
- Around line 340-349: The update to the APIServer object does a single
read-modify-write and can fail on resourceVersion conflicts; wrap the
mutation+Update call in a retry.RetryOnConflict loop so it re-reads and retries
on conflict. Locate the block that calls libClientSet.ApiServerConfig.Get and
then sets apiServer.Spec.Encryption.Type and calling
libClientSet.ApiServerConfig.Update, and replace that update path with a
retry.RetryOnConflict that fetches the latest ApiServer, applies the same
encryptionType mutation to apiServer.Spec.Encryption.Type, and retries the
Update until it succeeds or a non-conflict error occurs.
- Around line 204-210: The non-blocking select on migrationStartedCh can cause
false failures; replace the select/default with a blocking receive that waits
with a bounded timeout or a cancelable context so the test actually observes the
watcher send. Specifically, wait for migrationStartedCh (instead of the default
path) and compute migrationTime = endTimeStamp.Sub(migrationStarted) before
calling tb.Logf and scenario.AssertMigrationTime; if the timeout/context
expires, call tb.Error with a clear timeout message. Use the existing symbols
migrationStartedCh, endTimeStamp, tb.Logf, and scenario.AssertMigrationTime to
locate and update the code.
- Around line 73-81: The switch in the dispatcher incorrectly routes the
empty-string provider to TestEncryptionTypeIdentity; update the case handling so
that configv1.EncryptionTypeIdentity maps only to TestEncryptionTypeIdentity and
the empty string ("") maps to TestEncryptionTypeUnset. Locate the switch in
TestEncryption scenarios (the switch on provider in scenarios.go) and replace
the combined case `case configv1.EncryptionTypeIdentity, "":
TestEncryptionTypeIdentity(tb, scenario)` with two separate cases: one calling
TestEncryptionTypeIdentity(tb, scenario) for configv1.EncryptionTypeIdentity and
one calling TestEncryptionTypeUnset(tb, scenario) for the "" provider; leave the
other cases (TestEncryptionTypeAESCBC, TestEncryptionTypeAESGCM,
TestEncryptionTypeKMS) unchanged.
---
Nitpick comments:
In `@pkg/controllers/readiness/wellknown_ready_controller.go`:
- Around line 355-358: The loop in getKASTargetPortFromService repeatedly calls
getKASServicePort() inside the condition which obscures intent; cache the result
once before iterating by assigning kasPort := getKASServicePort(), then use
kasPort in the loop condition (referencing getKASTargetPortFromService and
getKASServicePort) so the code is clearer and avoids repeated calls.
In `@test/e2e-encryption-perf/encryption_perf.go`:
- Around line 55-65: The AssertDBPopulatedFunc currently fabricates errors and
calls require.NoError; change those to direct test failures using t.Fatalf: when
tokenStatsKey is missing, call t.Fatalf("missing oauth access tokens count
stats, can't continue the test"); when tokenCount < 14000, call
t.Fatalf("expected to create at least 14000 tokens but %d were created",
tokenCount). Update the block in AssertDBPopulatedFunc (which reads tokenCount,
ok := statStore[tokenStatsKey]) to remove the temporary err variables and use
t.Fatalf with the appropriate messages.
- Around line 81-97: The returned anonymous function in createAccessTokenWrapper
currently declares a parameter named namespace that is never used; update the
signature of the returned func (in createAccessTokenWrapper) to replace the
unused namespace parameter with _ (e.g., func(_ kubernetes.Interface, _ string,
errorCollector func(error), statsCollector func(string)) error) so it's explicit
the namespace is intentionally ignored while keeping the rest of the behavior
the same.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: be01b227-071d-425e-b47e-7ae44ef825ce
⛔ Files ignored due to path filters (11)
go.sumis excluded by!**/*.sumvendor/k8s.io/klog/v2/README.mdis excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/internal/serialize/keyvalues.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/internal/serialize/keyvalues_no_slog.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/internal/serialize/keyvalues_slog.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/klog.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/klogr.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/klogr_slog.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/textlogger/options.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/textlogger/textlogger.gois excluded by!**/vendor/**,!vendor/**vendor/modules.txtis excluded by!**/vendor/**,!vendor/**
📒 Files selected for processing (22)
cmd/cluster-authentication-operator-tests-ext/main.gogo.modpkg/controllers/readiness/wellknown_ready_controller.gotest/e2e-encryption-kms/encryption_kms.gotest/e2e-encryption-kms/encryption_kms_test.gotest/e2e-encryption-kms/main_test.gotest/e2e-encryption-perf/encryption_perf.gotest/e2e-encryption-perf/encryption_perf_test.gotest/e2e-encryption-rotation/e2e-encryption-rotation_test.gotest/e2e-encryption-rotation/encryption_rotation.gotest/e2e-encryption/encryption.gotest/e2e-encryption/encryption_test.gotest/e2e-encryption/main_test.gotest/e2e-oidc/external_oidc.gotest/e2e-oidc/external_oidc_test.gotest/library/client.gotest/library/encryption/helpers.gotest/library/encryption/perf_helpers.gotest/library/encryption/scenarios.gotest/library/encryption_wrappers.gotest/library/featuregates.gotest/library/waits.go
💤 Files with no reviewable changes (2)
- test/e2e-encryption-kms/main_test.go
- test/e2e-encryption/main_test.go
✅ Files skipped from review due to trivial changes (1)
- go.mod
🚧 Files skipped from review as they are similar to previous changes (6)
- test/library/encryption_wrappers.go
- test/library/featuregates.go
- cmd/cluster-authentication-operator-tests-ext/main.go
- test/e2e-encryption-kms/encryption_kms_test.go
- test/e2e-encryption/encryption.go
- test/library/waits.go
aa529ed to
805bcb9
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (7)
test/e2e-encryption-kms/encryption_kms.go (2)
48-53: Typo: "techincal" should be "technical".Line 52 contains a spelling error in the comment.
📝 Fix typo
// TODO: Remove this skip once the authentication operator fully supports KMS encryption. // Currently, while the API accepts encryption.type: "KMS" and the operator mounts the KMS // plugin socket, it does not generate the EncryptionConfiguration with KMS provider stanza. // This causes tests to timeout waiting for encryption keys to be created and migration to complete. - // See: https://redhat.atlassian.net/browse/OCPSTRAT-108 it is still under techincal preview + // See: https://redhat.atlassian.net/browse/OCPSTRAT-108 it is still under technical preview t.Skip("Skipping KMS encryption test: operator implementation is incomplete")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-encryption-kms/encryption_kms.go` around lines 48 - 53, Fix the spelling mistake in the comment above the t.Skip call: change "techincal" to "technical" in the comment that references the operator being under technical preview (the comment block immediately preceding the t.Skip("Skipping KMS encryption test: operator implementation is incomplete") line).
95-100: Same typo: "techincal" should be "technical".📝 Fix typo
// TODO: Remove this skip once the authentication operator fully supports KMS encryption. // Currently, while the API accepts encryption.type: "KMS" and the operator mounts the KMS // plugin socket, it does not generate the EncryptionConfiguration with KMS provider stanza. // This causes tests to timeout waiting for encryption keys to be created and migration to complete. - // See: https://redhat.atlassian.net/browse/OCPSTRAT-108 it is still under techincal preview + // See: https://redhat.atlassian.net/browse/OCPSTRAT-108 it is still under technical preview t.Skip("Skipping KMS encryption test: operator implementation is incomplete")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-encryption-kms/encryption_kms.go` around lines 95 - 100, Fix the typo in the block comment above the t.Skip call: change "techincal" to "technical" in the sentence referencing the technical preview and the Jira ticket so the comment reads correctly; the change is in the comment surrounding the t.Skip("Skipping KMS encryption test: operator implementation is incomplete") statement in encryption_kms.go.test/library/encryption/perf_helpers.go (1)
118-120: Misleading log message when workers encounter errors.The log message "All workers completed successfully" is printed regardless of whether workers encountered errors. Since errors are collected in
r.errorStorerather than causing the function to fail, this message can be misleading during debugging.📝 Suggested improvement
r.wg.Wait() - t.Log("All workers completed successfully") + if len(r.errorStore) > 0 { + t.Logf("All workers completed with %d unique error types", len(r.errorStore)) + } else { + t.Log("All workers completed successfully") + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/library/encryption/perf_helpers.go` around lines 118 - 120, The current code always logs "All workers completed successfully" after r.wg.Wait() which is misleading because worker errors are stored in r.errorStore; update the post-wait logic to inspect r.errorStore (e.g., check its error count or equivalent method on the runner) and only emit the success message when it is empty, otherwise log or fail the test with the collected errors (use t.Logf/t.Fatalf and include the contents of r.errorStore) so the message accurately reflects worker outcomes.test/e2e-encryption-rotation/encryption_rotation.go (1)
76-76: Use the predefined constantconfigv1.EncryptionTypeAESCBCinstead of string cast.Line 76 uses
configv1.EncryptionType("aescbc"), which works but should use the constantconfigv1.EncryptionTypeAESCBCdefined in the API types. This is consistent with how the constant is used throughout the encryption test library and aligns with the codebase pattern.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-encryption-rotation/encryption_rotation.go` at line 76, Replace the string-cast usage configv1.EncryptionType("aescbc") with the predefined constant configv1.EncryptionTypeAESCBC in the EncryptionProvider field; locate the assignment in encryption_rotation.go (the EncryptionProvider assignment) and swap the value to configv1.EncryptionTypeAESCBC so the test uses the API constant consistently with other tests.test/e2e-oidc/external_oidc.go (3)
950-969: Consider cleaning up test-created Keycloak resources.The test creates a group and user in Keycloak but doesn't explicitly clean them up. If the Keycloak instance persists across test runs, these resources accumulate. If the IDP cleanup (
test.AddKeycloakIDPcleanup handlers at line 127) removes the entire Keycloak deployment, this is acceptable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-oidc/external_oidc.go` around lines 950 - 969, The test creates Keycloak resources (group and user via kcClient.CreateGroup and kcClient.CreateUser assigned to variables group and user) but never explicitly deletes them; add cleanup logic that calls the appropriate kcClient.DeleteGroup(group) and kcClient.DeleteUser(user) (or equivalent deletion methods) in a deferred or test cleanup handler after creation, or ensure these deletions are registered with the existing test.AddKeycloakIDP cleanup handlers so the group and user are removed when the Keycloak IDP is torn down.
390-439: Use context-aware HTTP requests for cancellation support.The
client.Get()calls don't support context cancellation. Consider accepting a context parameter and usinghttp.NewRequestWithContextwithclient.Do()instead. This allows the requests to be cancelled if the test context times out.♻️ Proposed fix
-func fetchIssuerJWKS(issuerURL string) (*jwks, error) { +func fetchIssuerJWKS(ctx context.Context, issuerURL string) (*jwks, error) { client := &http.Client{ Timeout: 30 * time.Second, Transport: &http.Transport{ TLSClientConfig: &tls.Config{ InsecureSkipVerify: true, MinVersion: tls.VersionTLS12, }, }, } // grab openid-configuration JSON which contains the URL of the provider's JWKS - resp, err := client.Get(issuerURL + "/.well-known/openid-configuration") + req, err := http.NewRequestWithContext(ctx, http.MethodGet, issuerURL+"/.well-known/openid-configuration", nil) + if err != nil { + return nil, fmt.Errorf("could not create request for well-known config: %v", err) + } + resp, err := client.Do(req) if err != nil { return nil, fmt.Errorf("could not get issuer OpenID well-known configuration: %v", err) } defer resp.Body.Close() // ... similar change for the second request at line 422🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-oidc/external_oidc.go` around lines 390 - 439, fetchIssuerJWKS currently uses client.Get which doesn't support cancellation; change its signature to accept ctx context.Context (e.g., fetchIssuerJWKS(ctx context.Context, issuerURL string) (*jwks, error)) and replace both client.Get calls with context-aware requests by creating req := http.NewRequestWithContext(ctx, "GET", issuerURL+"/.well-known/openid-configuration", nil) and req = http.NewRequestWithContext(ctx, "GET", oidcConfig.JwksURL, nil) and use client.Do(req) for each, preserving existing error handling and defer resp.Body.Close().
690-716: String-based JSON comparison is fragile.The exact string comparison of
auth-config.jsonwill break if field ordering changes in the operator's JSON marshaling (though Go'sjson.Marshalis currently deterministic by struct field order). Consider unmarshaling both and using semantic comparison viaequality.Semantic.DeepEqualfor more robust assertions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-oidc/external_oidc.go` around lines 690 - 716, The test currently compares auth-config.json via exact string equality (expectedAuthConfigJSON vs actualCM.Data["auth-config.json"]) which is fragile; instead unmarshal both the expectedAuthConfigJSON and actualCM.Data["auth-config.json"] into generic JSON objects (e.g., map[string]interface{} or interface{}) and use k8s.io/apimachinery/pkg/api/equality.Semantic.DeepEqual to assert semantic equality; update the assertion after fetching actualCM (in the loop that calls tc.kubeClient.CoreV1().ConfigMaps(...).Get) to unmarshal both JSON strings, check errors with require.NoError, and require.Truef(t, equality.Semantic.DeepEqual(expectedObj, actualObj), "unexpected auth-config.json contents in %s/%s: diff=%v", actualCM.Namespace, actualCM.Name, cmp.Diff(expectedObj, actualObj)) so field ordering changes won’t break the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/e2e-encryption-kms/encryption_kms.go`:
- Around line 48-53: Fix the spelling mistake in the comment above the t.Skip
call: change "techincal" to "technical" in the comment that references the
operator being under technical preview (the comment block immediately preceding
the t.Skip("Skipping KMS encryption test: operator implementation is
incomplete") line).
- Around line 95-100: Fix the typo in the block comment above the t.Skip call:
change "techincal" to "technical" in the sentence referencing the technical
preview and the Jira ticket so the comment reads correctly; the change is in the
comment surrounding the t.Skip("Skipping KMS encryption test: operator
implementation is incomplete") statement in encryption_kms.go.
In `@test/e2e-encryption-rotation/encryption_rotation.go`:
- Line 76: Replace the string-cast usage configv1.EncryptionType("aescbc") with
the predefined constant configv1.EncryptionTypeAESCBC in the EncryptionProvider
field; locate the assignment in encryption_rotation.go (the EncryptionProvider
assignment) and swap the value to configv1.EncryptionTypeAESCBC so the test uses
the API constant consistently with other tests.
In `@test/e2e-oidc/external_oidc.go`:
- Around line 950-969: The test creates Keycloak resources (group and user via
kcClient.CreateGroup and kcClient.CreateUser assigned to variables group and
user) but never explicitly deletes them; add cleanup logic that calls the
appropriate kcClient.DeleteGroup(group) and kcClient.DeleteUser(user) (or
equivalent deletion methods) in a deferred or test cleanup handler after
creation, or ensure these deletions are registered with the existing
test.AddKeycloakIDP cleanup handlers so the group and user are removed when the
Keycloak IDP is torn down.
- Around line 390-439: fetchIssuerJWKS currently uses client.Get which doesn't
support cancellation; change its signature to accept ctx context.Context (e.g.,
fetchIssuerJWKS(ctx context.Context, issuerURL string) (*jwks, error)) and
replace both client.Get calls with context-aware requests by creating req :=
http.NewRequestWithContext(ctx, "GET",
issuerURL+"/.well-known/openid-configuration", nil) and req =
http.NewRequestWithContext(ctx, "GET", oidcConfig.JwksURL, nil) and use
client.Do(req) for each, preserving existing error handling and defer
resp.Body.Close().
- Around line 690-716: The test currently compares auth-config.json via exact
string equality (expectedAuthConfigJSON vs actualCM.Data["auth-config.json"])
which is fragile; instead unmarshal both the expectedAuthConfigJSON and
actualCM.Data["auth-config.json"] into generic JSON objects (e.g.,
map[string]interface{} or interface{}) and use
k8s.io/apimachinery/pkg/api/equality.Semantic.DeepEqual to assert semantic
equality; update the assertion after fetching actualCM (in the loop that calls
tc.kubeClient.CoreV1().ConfigMaps(...).Get) to unmarshal both JSON strings,
check errors with require.NoError, and require.Truef(t,
equality.Semantic.DeepEqual(expectedObj, actualObj), "unexpected
auth-config.json contents in %s/%s: diff=%v", actualCM.Namespace, actualCM.Name,
cmp.Diff(expectedObj, actualObj)) so field ordering changes won’t break the
test.
In `@test/library/encryption/perf_helpers.go`:
- Around line 118-120: The current code always logs "All workers completed
successfully" after r.wg.Wait() which is misleading because worker errors are
stored in r.errorStore; update the post-wait logic to inspect r.errorStore
(e.g., check its error count or equivalent method on the runner) and only emit
the success message when it is empty, otherwise log or fail the test with the
collected errors (use t.Logf/t.Fatalf and include the contents of r.errorStore)
so the message accurately reflects worker outcomes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b29dbfbd-1513-45ba-a96f-a0e9944a0f1f
⛔ Files ignored due to path filters (11)
go.sumis excluded by!**/*.sumvendor/k8s.io/klog/v2/README.mdis excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/internal/serialize/keyvalues.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/internal/serialize/keyvalues_no_slog.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/internal/serialize/keyvalues_slog.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/klog.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/klogr.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/klogr_slog.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/textlogger/options.gois excluded by!**/vendor/**,!vendor/**vendor/k8s.io/klog/v2/textlogger/textlogger.gois excluded by!**/vendor/**,!vendor/**vendor/modules.txtis excluded by!**/vendor/**,!vendor/**
📒 Files selected for processing (22)
cmd/cluster-authentication-operator-tests-ext/main.gogo.modpkg/controllers/readiness/wellknown_ready_controller.gotest/e2e-encryption-kms/encryption_kms.gotest/e2e-encryption-kms/encryption_kms_test.gotest/e2e-encryption-kms/main_test.gotest/e2e-encryption-perf/encryption_perf.gotest/e2e-encryption-perf/encryption_perf_test.gotest/e2e-encryption-rotation/e2e-encryption-rotation_test.gotest/e2e-encryption-rotation/encryption_rotation.gotest/e2e-encryption/encryption.gotest/e2e-encryption/encryption_test.gotest/e2e-encryption/main_test.gotest/e2e-oidc/external_oidc.gotest/e2e-oidc/external_oidc_test.gotest/library/client.gotest/library/encryption/helpers.gotest/library/encryption/perf_helpers.gotest/library/encryption/scenarios.gotest/library/encryption_wrappers.gotest/library/featuregates.gotest/library/waits.go
💤 Files with no reviewable changes (2)
- test/e2e-encryption/main_test.go
- test/e2e-encryption-kms/main_test.go
✅ Files skipped from review due to trivial changes (2)
- go.mod
- test/library/encryption/helpers.go
🚧 Files skipped from review as they are similar to previous changes (9)
- test/library/featuregates.go
- test/library/client.go
- test/e2e-encryption/encryption.go
- pkg/controllers/readiness/wellknown_ready_controller.go
- test/e2e-encryption/encryption_test.go
- test/library/waits.go
- test/e2e-encryption-perf/encryption_perf.go
- test/library/encryption_wrappers.go
- test/library/encryption/scenarios.go
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Unrecoverable error: Failed to clone repository 🏁 Script executed: Length of output: 10050 Let me fetch the logs from the failing jobs to understand the root cause. 🏁 Script executed: Length of output: 3332 🏁 Script executed: Length of output: 50284
1.
|
| Job | Root Cause | Related to PR? |
|---|---|---|
e2e-agnostic-upgrade |
MonitorTest failure (3/10) | ❌ No |
e2e-aws-operator-encryption-perf-serial-ote-1of2 |
Post-phase job timeout (test itself passed!) | ❌ No |
e2e-aws-operator-encryption-serial-ote-2of2 |
MonitorTest failure (3/10) | ❌ No |
e2e-aws-operator-parallel-ote |
MonitorTest failure (2/10) | ❌ No |
e2e-aws-operator-serial-ote |
Mass MonitorTest failure (12/10) | ❌ No |
All failures are infrastructure-level MonitorTest violations or a CI job timeout during artifact collection — not caused by the code changes in this PR. Notably, all GCP-based encryption jobs passed, the actual test steps in failing jobs either passed or weren't reached, and the original "invalid character 'W'" JSON parsing issue fixed by this PR is not present in any of the failures.
You can safely request a re-run of the failing jobs with /retest.
|
Failure logs are not wrt the PR. Here the tc execution list PASS logs for profile: e2e-agnostic-ipv6 The dependent PR is merged: openshift/origin#31009 /verified by CI results |
|
New changes are detected. LGTM label has been removed. |
|
@ropatil010: This pull request references CNTRLPLANE-2589 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
744b2c3 to
c0f4b27
Compare
✅ Action performedFull review finished. |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/library/waits.go (1)
171-190:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not treat an empty
NodeStatuseslist as a completed rollout.If
kas.Status.NodeStatusesis empty, the loop never runs and this returns success as soon asLatestAvailableRevision != origRevision. That lets callers proceed before any apiserver instance has actually reported the new revision.Suggested fix
err := wait.PollUntilContextTimeout(ctx, 10*time.Second, 30*time.Minute, true, func(ctx context.Context) (bool, error) { kas, err := kasClient.Get(ctx, "cluster", metav1.GetOptions{}) if err != nil { t.Logf("kubeapiserver/cluster error: %v", err) return false, nil } + + if kas.Status.LatestAvailableRevision == origRevision { + return false, nil + } + if len(kas.Status.NodeStatuses) == 0 { + return false, nil + } for _, nodeStatus := range kas.Status.NodeStatuses { - if kas.Status.LatestAvailableRevision == origRevision { - return false, nil - } - if nodeStatus.CurrentRevision != kas.Status.LatestAvailableRevision { return false, nil }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/library/waits.go` around lines 171 - 190, The callback passed to wait.PollUntilContextTimeout (the anonymous func calling kasClient.Get) currently treats an empty kas.Status.NodeStatuses as success when LatestAvailableRevision != origRevision; change the callback to first check if len(kas.Status.NodeStatuses) == 0 and if so return false, nil (not success) so we wait until at least one apiserver reports; keep the existing checks for LatestAvailableRevision vs origRevision and per-node CurrentRevision vs LatestAvailableRevision and still set latestRevision = nodeStatus.CurrentRevision when iterating.
♻️ Duplicate comments (2)
test/library/keycloakidp.go (1)
246-285:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftAuthentication requests can still outlive the test context.
KeycloakClientForstill creates anhttp.Clientwithout a timeout, andAuthenticatePasswordstill useshttp.NewRequestinstead of a request-scoped context. The retry loops inAddKeycloakIDPdepend on this path, so a stalled Keycloak connection can ignore poll cancellation and hang until the job timeout. Threadctxinto auth and give the client a timeout.As per coding guidelines,
**/*.go: Go security (prodsec-skills): context.Context for cancellation and timeouts.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/library/keycloakidp.go` around lines 246 - 285, KeycloakClientFor creates an http.Client with no timeout and AuthenticatePassword builds requests without a context, allowing auth calls to outlive/circumvent caller cancellation (e.g., the AddKeycloakIDP retry/poll). Fix by threading a context.Context into AuthenticatePassword (signature change: AuthenticatePassword(ctx context.Context, ...)), use http.NewRequestWithContext(ctx, ...) and ensure callers (including AddKeycloakIDP and its retry loop) pass their ctx through; also set a sensible timeout on the client created in KeycloakClientFor (or accept a client/timeout param) so in-flight connections are bounded by caller/cancelation. Ensure TokenURL() and KeycloakClientFor references remain unchanged except for wiring the ctx and timeout.Source: Coding guidelines
test/e2e-encryption-rotation/encryption_rotation.go (1)
53-78:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHandle
Authenticationoperator update conflicts withretry.RetryOnConflict.This
Get→ mutate →Updatepath can fail onresourceVersionconflicts and make this serial rotation test flaky when the operator updates the same object concurrently.Suggested patch
import ( "context" "encoding/json" "testing" "time" @@ metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" "k8s.io/apimachinery/pkg/runtime" + "k8s.io/client-go/util/retry" @@ ForceRotationFunc: library.StaticEncryptionForceRotation(func(rawUnsupportedEncryptionCfg []byte) error { cs := operatorencryption.GetClients(t) - authOperator, err := cs.OperatorClient.Get(ctx, "cluster", metav1.GetOptions{}) - if err != nil { - return err - } - - unsupportedConfigAsMap := map[string]interface{}{} - if len(authOperator.Spec.UnsupportedConfigOverrides.Raw) > 0 { - if err := json.Unmarshal(authOperator.Spec.UnsupportedConfigOverrides.Raw, &unsupportedConfigAsMap); err != nil { - return err - } - } - unsupportedEncryptionConfigAsMap := map[string]interface{}{} - if err := json.Unmarshal(rawUnsupportedEncryptionCfg, &unsupportedEncryptionConfigAsMap); err != nil { - return err - } - if err := unstructured.SetNestedMap(unsupportedConfigAsMap, unsupportedEncryptionConfigAsMap, oauthapiconfigobservercontroller.OAuthAPIServerConfigPrefix); err != nil { - return err - } - rawUnsupportedCfg, err := json.Marshal(unsupportedConfigAsMap) - if err != nil { - return err - } - authOperator.Spec.UnsupportedConfigOverrides.Raw = rawUnsupportedCfg - - _, err = cs.OperatorClient.Update(ctx, authOperator, metav1.UpdateOptions{}) - return err + return retry.RetryOnConflict(retry.DefaultRetry, func() error { + authOperator, err := cs.OperatorClient.Get(ctx, "cluster", metav1.GetOptions{}) + if err != nil { + return err + } + + unsupportedConfigAsMap := map[string]interface{}{} + if len(authOperator.Spec.UnsupportedConfigOverrides.Raw) > 0 { + if err := json.Unmarshal(authOperator.Spec.UnsupportedConfigOverrides.Raw, &unsupportedConfigAsMap); err != nil { + return err + } + } + unsupportedEncryptionConfigAsMap := map[string]interface{}{} + if err := json.Unmarshal(rawUnsupportedEncryptionCfg, &unsupportedEncryptionConfigAsMap); err != nil { + return err + } + if err := unstructured.SetNestedMap(unsupportedConfigAsMap, unsupportedEncryptionConfigAsMap, oauthapiconfigobservercontroller.OAuthAPIServerConfigPrefix); err != nil { + return err + } + rawUnsupportedCfg, err := json.Marshal(unsupportedConfigAsMap) + if err != nil { + return err + } + authOperator.Spec.UnsupportedConfigOverrides.Raw = rawUnsupportedCfg + + _, err = cs.OperatorClient.Update(ctx, authOperator, metav1.UpdateOptions{}) + return err + }) }),🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-encryption-rotation/encryption_rotation.go` around lines 53 - 78, The Get→mutate→Update sequence on the Authentication operator (cs.OperatorClient.Get / authOperator / cs.OperatorClient.Update) can fail with resourceVersion conflicts; wrap the whole read-modify-update in a retry.RetryOnConflict(retry.DefaultRetry, func() error { ... }) loop that re-reads the authOperator inside the closure, applies the same JSON Unmarshal/SetNestedMap/Marshal mutation to authOperator.Spec.UnsupportedConfigOverrides.Raw, and then calls Update; return the Update error from the closure and return the retry wrapper result from the calling function so conflicts are retried safely.
🧹 Nitpick comments (5)
test/library/waits.go (1)
61-70: 🏗️ Heavy liftPass a caller context through this wait chain.
WaitForClusterOperatorStatusstill hardcodescontext.TODO(), soWaitForOperatorToPickUpChangesand the other wrappers can keep polling for the full 10 minutes after the enclosing test has already been canceled. Threadctxthrough this helper family and use it in bothPollUntilContextTimeoutandCheckClusterOperatorStatus.As per coding guidelines,
**/*.go: Go security (prodsec-skills): context.Context for cancellation and timeouts.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/library/waits.go` around lines 61 - 70, WaitForClusterOperatorStatus currently uses context.TODO() causing callers like WaitForOperatorToPickUpChanges to be unable to cancel the wait; change the signature of WaitForClusterOperatorStatus to accept a context.Context parameter, thread that ctx through the call to wait.PollUntilContextTimeout instead of context.TODO(), and pass the same ctx into CheckClusterOperatorStatus (and update any callers such as WaitForOperatorToPickUpChanges to forward their ctx); ensure all references to PollUntilContextTimeout and CheckClusterOperatorStatus use the passed ctx so the wait honors caller cancellation.Source: Coding guidelines
test/library/client.go (1)
168-171: ⚡ Quick winAccept a caller context for namespace creation.
This helper hardcodes
context.Background(), even though callers already track test contexts (test/library/idpdeployment.go:50-59). If the API call stalls, namespace creation ignores suite cancellation. Addingctx context.Contexthere keeps the new exported builder consistent with the rest of the context-aware helpers.As per coding guidelines,
**/*.go: Go security (prodsec-skills): context.Context for cancellation and timeouts.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/library/client.go` around lines 168 - 171, Change TestNamespaceBuilder.Create to accept a caller context (add ctx context.Context to the signature of Create) and pass that ctx into kubeClient.Create instead of using context.Background(); update the callsite(s) that use Create (e.g., in idpdeployment helpers) to forward their test context, and ensure the file imports context if not already present and the function name TestNamespaceBuilder.Create and the kubeClient.Create(...) invocation are updated accordingly.Source: Coding guidelines
test/e2e-encryption-perf/encryption_perf.go (1)
79-79: ⚡ Quick winPrefer
configv1.EncryptionTypeAESCBCover string casting.Line 79 should use the typed constant for stronger API compatibility and typo safety.
Proposed change
- APIServerEncryption: configv1.APIServerEncryption{Type: configv1.EncryptionType("aescbc")}, + APIServerEncryption: configv1.APIServerEncryption{Type: configv1.EncryptionTypeAESCBC},🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-encryption-perf/encryption_perf.go` at line 79, Replace the string-cast encryption type with the typed constant: in the APIServerEncryption struct (symbol APIServerEncryption / type configv1.APIServerEncryption) change usage of configv1.EncryptionType("aescbc") to the exported constant configv1.EncryptionTypeAESCBC to ensure type safety and avoid typos.test/e2e-encryption/encryption.go (1)
76-76: ⚡ Quick winUse the typed encryption constant instead of a raw string cast.
Line 76 should prefer the API constant to avoid typo drift and keep enum usage explicit.
Proposed change
- APIServerEncryption: configv1.APIServerEncryption{Type: configv1.EncryptionType("aescbc")}, + APIServerEncryption: configv1.APIServerEncryption{Type: configv1.EncryptionTypeAESCBC},🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-encryption/encryption.go` at line 76, Replace the raw string cast used for APIServerEncryption (configv1.APIServerEncryption{Type: configv1.EncryptionType("aescbc")}) with the typed enum constant (e.g. configv1.AESCBC) to avoid typo drift; update the value to configv1.APIServerEncryption{Type: configv1.AESCBC} so the code uses the declared constant instead of a literal string cast.test/e2e-encryption-rotation/encryption_rotation.go (1)
49-49: ⚡ Quick winUse the typed AES-CBC constant instead of a string cast.
configv1.EncryptionTypeAESCBCis clearer and avoids literal drift.Suggested patch
- APIServerEncryption: configv1.APIServerEncryption{Type: configv1.EncryptionType("aescbc")}, + APIServerEncryption: configv1.APIServerEncryption{Type: configv1.EncryptionTypeAESCBC},🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e-encryption-rotation/encryption_rotation.go` at line 49, Replace the string-cast encryption type with the typed constant: in the APIServerEncryption initialization (configv1.APIServerEncryption) use configv1.EncryptionTypeAESCBC instead of configv1.EncryptionType("aescbc") so the field reads APIServerEncryption: configv1.APIServerEncryption{Type: configv1.EncryptionTypeAESCBC}.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e-oidc/external_oidc.go`:
- Around line 396-431: Change fetchIssuerJWKS to accept a context.Context (e.g.,
func fetchIssuerJWKS(ctx context.Context, issuerURL string) ...) and replace the
two client.Get calls with context-aware requests: create
http.NewRequestWithContext(ctx, "GET",
issuerURL+"/.well-known/openid-configuration", nil) and use client.Do(req) (and
similarly for oidcConfig.JwksURL). Ensure error handling and resp.Body.Close
remain unchanged and propagate ctx from callers like testOIDCAuthentication.
In `@test/library/keycloakidp.go`:
- Around line 703-716: The loop in DeleteUserFromGroups keeps each DELETE
response body open by deferring resp.Body.Close() inside the loop; change this
to close resp.Body explicitly at the end of each iteration (and before any early
return) so each resp from kc.do is closed immediately after checking StatusCode
and error handling; locate the DeleteUserFromGroups function, remove or replace
the in-loop defer resp.Body.Close() and call resp.Body.Close() just after
handling non-nil err and again immediately before returning an error for non-204
responses (and after successful iterations) when using kc.do and userGroupsURL.
---
Outside diff comments:
In `@test/library/waits.go`:
- Around line 171-190: The callback passed to wait.PollUntilContextTimeout (the
anonymous func calling kasClient.Get) currently treats an empty
kas.Status.NodeStatuses as success when LatestAvailableRevision != origRevision;
change the callback to first check if len(kas.Status.NodeStatuses) == 0 and if
so return false, nil (not success) so we wait until at least one apiserver
reports; keep the existing checks for LatestAvailableRevision vs origRevision
and per-node CurrentRevision vs LatestAvailableRevision and still set
latestRevision = nodeStatus.CurrentRevision when iterating.
---
Duplicate comments:
In `@test/e2e-encryption-rotation/encryption_rotation.go`:
- Around line 53-78: The Get→mutate→Update sequence on the Authentication
operator (cs.OperatorClient.Get / authOperator / cs.OperatorClient.Update) can
fail with resourceVersion conflicts; wrap the whole read-modify-update in a
retry.RetryOnConflict(retry.DefaultRetry, func() error { ... }) loop that
re-reads the authOperator inside the closure, applies the same JSON
Unmarshal/SetNestedMap/Marshal mutation to
authOperator.Spec.UnsupportedConfigOverrides.Raw, and then calls Update; return
the Update error from the closure and return the retry wrapper result from the
calling function so conflicts are retried safely.
In `@test/library/keycloakidp.go`:
- Around line 246-285: KeycloakClientFor creates an http.Client with no timeout
and AuthenticatePassword builds requests without a context, allowing auth calls
to outlive/circumvent caller cancellation (e.g., the AddKeycloakIDP retry/poll).
Fix by threading a context.Context into AuthenticatePassword (signature change:
AuthenticatePassword(ctx context.Context, ...)), use
http.NewRequestWithContext(ctx, ...) and ensure callers (including
AddKeycloakIDP and its retry loop) pass their ctx through; also set a sensible
timeout on the client created in KeycloakClientFor (or accept a client/timeout
param) so in-flight connections are bounded by caller/cancelation. Ensure
TokenURL() and KeycloakClientFor references remain unchanged except for wiring
the ctx and timeout.
---
Nitpick comments:
In `@test/e2e-encryption-perf/encryption_perf.go`:
- Line 79: Replace the string-cast encryption type with the typed constant: in
the APIServerEncryption struct (symbol APIServerEncryption / type
configv1.APIServerEncryption) change usage of configv1.EncryptionType("aescbc")
to the exported constant configv1.EncryptionTypeAESCBC to ensure type safety and
avoid typos.
In `@test/e2e-encryption-rotation/encryption_rotation.go`:
- Line 49: Replace the string-cast encryption type with the typed constant: in
the APIServerEncryption initialization (configv1.APIServerEncryption) use
configv1.EncryptionTypeAESCBC instead of configv1.EncryptionType("aescbc") so
the field reads APIServerEncryption: configv1.APIServerEncryption{Type:
configv1.EncryptionTypeAESCBC}.
In `@test/e2e-encryption/encryption.go`:
- Line 76: Replace the raw string cast used for APIServerEncryption
(configv1.APIServerEncryption{Type: configv1.EncryptionType("aescbc")}) with the
typed enum constant (e.g. configv1.AESCBC) to avoid typo drift; update the value
to configv1.APIServerEncryption{Type: configv1.AESCBC} so the code uses the
declared constant instead of a literal string cast.
In `@test/library/client.go`:
- Around line 168-171: Change TestNamespaceBuilder.Create to accept a caller
context (add ctx context.Context to the signature of Create) and pass that ctx
into kubeClient.Create instead of using context.Background(); update the
callsite(s) that use Create (e.g., in idpdeployment helpers) to forward their
test context, and ensure the file imports context if not already present and the
function name TestNamespaceBuilder.Create and the kubeClient.Create(...)
invocation are updated accordingly.
In `@test/library/waits.go`:
- Around line 61-70: WaitForClusterOperatorStatus currently uses context.TODO()
causing callers like WaitForOperatorToPickUpChanges to be unable to cancel the
wait; change the signature of WaitForClusterOperatorStatus to accept a
context.Context parameter, thread that ctx through the call to
wait.PollUntilContextTimeout instead of context.TODO(), and pass the same ctx
into CheckClusterOperatorStatus (and update any callers such as
WaitForOperatorToPickUpChanges to forward their ctx); ensure all references to
PollUntilContextTimeout and CheckClusterOperatorStatus use the passed ctx so the
wait honors caller cancellation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: aeeac077-120e-410e-9f38-78cfdf7bdc30
📒 Files selected for processing (20)
cmd/cluster-authentication-operator-tests-ext/main.gotest/e2e-encryption-perf/encryption_perf.gotest/e2e-encryption-perf/encryption_perf_test.gotest/e2e-encryption-rotation/e2e-encryption-rotation_test.gotest/e2e-encryption-rotation/encryption_rotation.gotest/e2e-encryption/encryption.gotest/e2e-encryption/encryption_test.gotest/e2e-encryption/main_test.gotest/e2e-oidc/external_oidc.gotest/e2e-oidc/external_oidc_test.gotest/e2e/keycloak.gotest/library/client.gotest/library/encryption/helpers.gotest/library/encryption/perf_helpers.gotest/library/encryption/scenarios.gotest/library/encryption_wrappers.gotest/library/featuregates.gotest/library/idpdeployment.gotest/library/keycloakidp.gotest/library/waits.go
54487ef to
42f050f
Compare
|
Below Failure profiles are related to monitor cases.
Hi @liouk PTAL on this PR. Any suggestion would you like to give? /test e2e-aws-operator-encryption-perf-serial-ote-2of2 |
There was a problem hiding this comment.
As far as I can see, the direction of the most recent changes isn't the right one:
-
Adding
[Disruptive]to encryption tests breaks the dedicated encryption OTE suites. The encryption suite qualifiers inmain.gofilter on[Encryption]and[Serial]but explicitly exclude[Disruptive](via!name.contains("[Disruptive]")). By adding[Disruptive]to these tests, they'll be
excluded from their own suites and instead fall into the operator/disruptive suite. The encryption-specific OTE jobs (operator-encryption/serial,operator-encryption-rotation/serial,
operator-encryption-perf/serial) will have zero tests to run. I don't think we need to add[Disruptive]to the tests, we should instead set cluster stability to disruptive; these are two different things (one is a tag used for test filtering, the other is a suite property that affects what monitor tests are executed on the suite). -
The
e2e-gcp-operator-disruptivejob fails on[Monitor:known-image-checker]Only known images used by tests -- the Keycloak/GitLab images aren't in the allowed list. Making the security context restricted
doesn't fix this. Even with this commit, the OIDC test still runs in the disruptive suite and the image check will still fail. Setting cluster stability to disruptive for the OIDC tests is the solution here as well, as it will not check for known images, but at the same time the OIDC suite is indeed disruptive as it currently rolls out the kube-apiserver. -
This change contains excessive comments, where in most cases we could do with 1-2 lines. The code is self explanatory and in most cases the comments don't add any information. Additionally, there seem to be a few misplaced comments in
keycloakidp.go.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
750d010 to
853094c
Compare
Migrated authentication operator e2e tests from library-go to OTE framework. - Converted test functions to accept testing.TB for better compatibility - Fixed golint issues: context parameter ordering, ID naming, documentation - Updated Keycloak security context to use restricted PSA instead of privileged SCC Fixes for e2e-oidc test compatibility: - Fixed context handling: Use context.WithCancel instead of context.TODO() to ensure proper cancellation when test completes or times out - Replaced g.By() with t.Logf() for dual-mode compatibility (works with both standalone go test and Ginkgo/OTE framework) These fixes resolve CI failures in e2e-oidc and e2e-oidc-techpreview jobs. Co-Authored-By: Rohit Patil <[email protected]> Fix PodSecurityViolation and pathological event monitor test failures Changes: - Make IDP security context configurable (privileged vs restricted) - GitLab requires privileged mode (manages system services as root) - Keycloak works with restricted mode (runs as non-root) - Set ClusterStability to Disruptive for encryption test suites - Encryption tests trigger API server rollouts causing temporary disruption events This fixes monitor test failures: - [Monitor:audit-log-analyzer] PodSecurityViolation (50% reduction) - [Monitor:legacy-test-framework-invariants-pathological] events should not repeat pathologically Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
853094c to
057237e
Compare
|
@ropatil010: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Hi Team,
This PR migrates authentication operator e2e tests from library-go to the OpenShift Tests Extension (OTE) framework and fixes multiple CI monitor test failures discovered during migration.
Part 1: OTE Framework Migration
Part 2: CI Monitor Test Failures (Critical Fixes)
After migration, the new
e2e-aws-operator-serial-otejob exposed previously hidden monitor test failures.Fix #1: PodSecurityViolation Monitor Failures ✅ (50% Reduction)
[Monitor:audit-log-analyzer] [bz-apiserver-auth][invariant] audit analysis PodSecurityViolation
[Monitor:legacy-test-framework-invariants-alerts] alert/PodSecurityViolation should not be at or above info
Now all three PSA labels are set consistently, eliminating audit log violations.
test/library/idpdeployment.go):createContainerSecurityContext()helperImpact:
Problem:
Encryption tests failing with pathological event errors:
Root Cause:
apiServer.Spec.Encryption→ triggers API server rolloutsFailedScheduling(pods can't schedule during rollout)ReadinessProbeFailed(new pods not ready yet)[Disruptive]Solution:
Added
[Disruptive]label to all encryption test suites:Applied to:
test/e2e-encryption/encryption.gotest/e2e-encryption-perf/encryption_perf.gotest/e2e-encryption-rotation/encryption_rotation.gotest/e2e-encryption-kms/encryption_kms.goWhy This Works:
ClusterStabilityDisruptive→ framework runsnewDisruptiveMonitorTests()Impact: Pathological event failures ELIMINATED
Why it remains:
Justification:
Failure:
Why not fixed here:
openshift/originrepositorytest/extended/util/image/README.mdin openshift/origin